1 Garbage collection: A true hardware read barrier
Matthias Meyer 

June 2006 Proceedings of the 5th international symposium on Memory management 
ISMM '06 

Publisher: ACM 

Full text available: pdf(991.66 KB) Additional Information: full citation, abstract, references, index terms



Read barriers synchronize compacting garbage collection and application processing in a 
simple yet elegant way. Unfortunately, read barrier checks are expensive to implement in 
software, and even with hardware support, the clustering of read barrier faults irregularly 
impairs application progress to an unacceptable extent. For this reason, read barriers are 
often considered unsuitable for hard real-time systems. In this paper, we introduce a novel
hardware read barrier design for an object-based ... 

Keywords: hardware support, object-based processor architecture, read barrier,
real-time garbage collection



2 The case for a read barrier
Douglas Johnson 

April 1991 ACM SIGPLAN Notices, ACM SIGARCH Computer Architecture News, ACM
SIGOPS Operating Systems Review, Proceedings of the fourth
international conference on Architectural support for programming
languages and operating systems ASPLOS-IV, Volume 26, 19, 25 Issue 4, 2,
Special Issue
Publisher: ACM Press 

Full text available: pdf(881.24 KB) Additional Information: full citation, references, citings, index terms



3 Implementation techniques: Barriers: friend or foe?
Stephen M. Blackburn, Antony L. Hosking 

October 2004 Proceedings of the 4th international symposium on Memory 
management ISMM '04 

Publisher: ACM Press 

Full text available: pdf(137.10 KB) Additional Information: full citation, abstract, references, citings, index terms

Modern garbage collectors rely on read and write barriers imposed on heap accesses by 
the mutator, to keep track of references between different regions of the garbage
collected heap, and to synchronize actions of the mutator with those of the collector. It 
has been a long-standing untested assumption that barriers impose significant overhead 
to garbage-collected applications. As a result, researchers have devoted effort to 
development of optimization approaches for elimination of unnecessary ... 

Keywords: garbage collection, Java, memory management, write barriers 



Barrier techniques for incremental tracing 
Pekka P. Pirinen 

October 1998 ACM SIGPLAN Notices, Proceedings of the 1st international symposium
on Memory management ISMM '98, Volume 34 Issue 3
Publisher: ACM Press 

Full text available: pdf(707.56 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents a classification of barrier techniques for interleaving tracing with 
mutator operation during an incremental garbage collection. The two useful tricolour 
invariants are derived from more elementary considerations of graph traversal. Barrier 
techniques for maintaining these invariants are classified according to the action taken at 
the barrier (such as scanning an object or changing its colour), and it is shown that the 
algorithms described in the literature cover all the poss ... 

5 Implementation techniques: Exploring the barrier to entry: incremental generational
garbage collection for Haskell

A. M. Cheadle, A. J. Field, S. Marlow, S. L. Peyton Jones, R. L. While 

October 2004 Proceedings of the 4th international symposium on Memory 

management ISMM '04 
Publisher: ACM Press 

Full text available: pdf(458.55 KB) Additional Information: full citation, abstract, references, citings, index terms

We document the design and implementation of a "production" incremental garbage
collector for GHC 6.2. It builds on our earlier work (Non-stop Haskell) that exploited GHC's
dynamic dispatch mechanism to hijack object code pointers so that objects in to-space 
automatically scavenge themselves when the mutator attempts to "enter" them. This 
paper details various optimisations based on code specialisation that remove the dynamic 
space, and associated time, overheads that accompanied our earlier sch ...

Keywords: incremental garbage collection, non-stop haskell 



6 Efficient techniques for fast nested barrier synchronization
Vara Ramakrishnan, Isaac D. Scherson, Raghu Subramanian 

July 1995 Proceedings of the seventh annual ACM symposium on Parallel algorithms 

and architectures SPAA '95 
Publisher: ACM Press 

Full text available: pdf(806.60 KB) Additional Information: full citation, references, citings, index terms




7 Concurrency: Write barrier elision for concurrent garbage collectors 

Martin T. Vechev, David F. Bacon 
October 2004 Proceedings of the 4th international symposium on Memory
management ISMM '04 

Publisher: ACM Press 
Full text available: pdf(490.73 KB) Additional Information: full citation, abstract, references, citings, index terms

Concurrent garbage collectors require write barriers to preserve consistency, but these 
barriers impose significant direct and indirect costs. While there has been a lot of work on 
optimizing write barriers, we present the first study of their elision in a concurrent 
collector. We show conditions under which write barriers are redundant, and describe how 
these conditions can be applied to both incremental update or snapshot-at-the-beginning 
barriers. We then evaluate the potential for write b ... 

Keywords: concurrent garbage collection, write barrier 



8 Enforcing isolation and ordering in STM

Tatiana Shpeisman, Vijay Menon, Ali-Reza Adl-Tabatabai, Steven Balensiefer, Dan Grossman, 
Richard L. Hudson, Katherine F. Moore, Bratin Saha 

June 2007 ACM SIGPLAN Notices , Proceedings of the 2007 ACM SIGPLAN conference 
on Programming language design and implementation PLDI '07, volume 42 

Issue 6 
Publisher: ACM Press 

Full text available: pdf(257.39 KB) Additional Information: full citation, abstract, references, index terms

Transactional memory provides a new concurrency control mechanism that avoids many 
of the pitfalls of lock-based synchronization. High-performance software transactional 
memory (STM) implementations thus far provide weak atomicity: Accessing shared data 
both inside and outside a transaction can result in unexpected, implementation-dependent 
behavior. To guarantee isolation and consistent ordering in such a system, programmers 
are expected to enclose all shared-memory accesses inside tr ... 

Keywords: code generation, compiler optimizations, escape analysis, isolation, ordering, 
strong atomicity, transactional memory, virtual machines, weak atomicity 



9 An effective hybrid transactional memory system with strong isolation guarantees
Chi Cao Minh, Martin Trautmann, JaeWoong Chung, Austen McDonald, Nathan Bronson,
Jared Casper, Christos Kozyrakis, Kunle Olukotun

June 2007 ACM SIGARCH Computer Architecture News , Proceedings of the 34th 

annual international symposium on Computer architecture ISCA '07, volume 

35 Issue 2 
Publisher: ACM Press 

Full text available: pdf(239.24 KB) Additional Information: full citation, abstract, references, index terms

We propose signature-accelerated transactional memory (SigTM), a hybrid TM system that
reduces the overhead of software transactions. SigTM uses hardware signatures to track
the read-set and write-set for pending transactions and perform conflict detection between
concurrent threads. All other transactional functionality, including data versioning, is
implemented in software. Unlike previously proposed hybrid TM systems, SigTM requires 
no modifications to the hardware caches, which reduces hardw ... 

Keywords: multi-core architectures, parallel programming, strong isolation, transactional 
memory 



10 An on-the-fly reference-counting garbage collector for Java
Yossi Levanoni, Erez Petrank 

January 2006 ACM Transactions on Programming Languages and Systems (TOPLAS), 

Volume 28 Issue 1 
Publisher: ACM Press 
Full text available: pdf Additional Information: full citation, abstract, references, index terms

Reference-counting is traditionally considered unsuitable for multiprocessor systems. 
According to conventional wisdom, the update of reference slots and reference-counts
requires atomic or synchronized operations. In this work we demonstrate this is not the 
case by presenting a novel reference-counting algorithm suitable for a multiprocessor 
system that does not require any synchronized operation in its write barrier (not even a 
compare-and-swap type of synchronization). A second novelty of thi ...

Keywords: Programming languages, garbage collection, memory management, 
reference-counting 



11 Sapphire: copying

Richard L. Hudson, J. Eliot B. Moss 

June 2001 Proceedings of the 2001 joint ACM-ISCOPE conference on Java Grande JGI 
'01 

Publisher: ACM Press 



Full text available: pdf(899.45 KB) Additional Information: full citation, abstract, references, citings, index terms

Many concurrent garbage collection (GC) algorithms have been devised, but few have 
been implemented and evaluated, particularly for the Java programming language. 
Sapphire is an algorithm we have devised for concurrent copying GC. Sapphire stresses 
minimizing the amount of time any given application thread may need to block to support 
the collector. In particular, Sapphire is intended to work well in the presence of a large 
number of application threads, on small- to medium-scale shared memor ... 

12 Objects and their collection: The pauseless GC algorithm
Cliff Click, Gil Tene, Michael Wolf 

June 2005 Proceedings of the 1st ACM/USENIX international conference on Virtual 
execution environments VEE '05 

Publisher: ACM Press 

Full text available: pdf(440.91 KB) Additional Information: full citation, abstract, references, citings, index terms

Modern transactional response-time sensitive applications have run into practical limits on 
the size of garbage collected heaps. The heap can only grow until GC pauses exceed the 
response-time limits. Sustainable, scalable concurrent collection has become a feature 
worth paying for. Azul Systems has built a custom system (CPU, chip, board, and OS)
specifically to run garbage collected virtual machines. The custom CPU includes a read 
barrier instruction. The read barrier enables a highly concurren ... 

Keywords: Java, concurrent GC, custom hardware, garbage collection, memory 
management, read barriers 



13 A real-time garbage collector with low overhead and consistent utilization
David F. Bacon, Perry Cheng, V. T. Rajan 

January 2003 ACM SIGPLAN Notices , Proceedings of the 30th ACM SIGPLAN-SIGACT 
symposium on Principles of programming languages POPL '03, volume 38 

Issue 1 

Publisher: ACM Press 

Full text available: pdf(517.37 KB) Additional Information: full citation, abstract, references, citings, index terms

Now that the use of garbage collection in languages like Java is becoming widely accepted 
due to the safety and software engineering benefits it provides, there is significant 
interest in applying garbage collection to hard real-time systems. Past approaches have
generally suffered from one of two major flaws: either they were not provably real-time, 
or they imposed large space overheads to meet the real-time bounds. We present a 
mostly non-moving, dynamically defragmenting collector that overco ... 

Keywords: defragmentation, read barrier, real-time scheduling, utilization 



14 Fast barrier synchronization hardware

Carl J. Beckmann, Constantine D. Polychronopoulos 

November 1990 Proceedings of the 1990 ACM/IEEE conference on Supercomputing 
Supercomputing '90 

Publisher: IEEE Computer Society 

Full text available: pdf(984.65 KB) Additional Information: full citation, abstract, references

Many recent studies have considered the importance of barrier synchronization overhead 
on parallel loop performance, especially for large-scale parallel machines. This paper 
describes a hardware scheme for supporting fast barrier synchronization. It allows barrier 
synchronization to be performed within a single instruction cycle for moderately sized 
systems, and is scalable with logarithmic increase in synchronization time. It supports a 
large number of concurrent barriers, and can also be used ... 

15 Architectural Support for Software Transactional Memory

Bratin Saha, Ali-Reza Adl-Tabatabai, Quinn Jacobson 

December 2006 Proceedings of the 39th Annual IEEE/ACM International Symposium 
on Microarchitecture MICRO 39 

Publisher: IEEE Computer Society 

Full text available: pdf(325.24 KB) Additional Information: full citation, abstract, index terms

Transactional memory provides a concurrency control mechanism that avoids many of the 
pitfalls of lock-based synchronization. Researchers have proposed several different 
implementations of transactional memory, broadly classified into software transactional 
memory (STM) and hardware transactional memory (HTM). Both approaches have their 
pros and cons; STMs provide rich and flexible transactional semantics on stock processors 
but incur significant overheads. HTMs, on the other hand, provide high ... 

16 Optimization and real time GC: Mark-sweep or copying?: a "best of both worlds"
algorithm and a hardware-supported real-time implementation

Sylvain Stanchina, Matthias Meyer 

October 2007 Proceedings of the 6th international symposium on Memory 
management ISMM '07 

Publisher: ACM 

Full text available: pdf(294.73 KB) Additional Information: full citation, abstract, references, index terms

Copying collectors offer a number of advantages over their mark-sweep counterparts. 
First, they do not have to deal with mark stacks and potential mark stack overflows. 
Second, they do not suffer from unpredictable fragmentation overheads since they 
inherently compact the heap. Third, the tospace invariant maintained by many copying 
collectors allows for incremental compaction and provides the basis for efficient real-time
implementations. Unfortunately, however, standard copying collectors de ... 

Keywords: hardware support, mark-compact collection, object-based processor 
architecture, real-time garbage collection 

17 Optimization and real time GC: Stopless: a real-time garbage collector for
multiprocessors
Filip Pizlo, Daniel Frampton, Erez Petrank, Bjarne Steensgaard

October 2007 Proceedings of the 6th international symposium on Memory 

management ISMM '07 
Publisher: ACM 

Full text available: pdf(303.94 KB) Additional Information: full citation, abstract, references, index terms

We present Stopless: a concurrent real-time garbage collector suitable for modern 
multiprocessors running parallel multithreaded applications. Creating a garbage-collected 
environment that supports real-time on modern platforms is notoriously hard, especially if 
real-time implies lock-freedom. Known real-time collectors either restrict the real-time 
guarantees to uniprocessors only, rely on special hardware, or just give up supporting 
atomic operations (which are crucial for lock-free software ... 

Keywords: concurrency, garbage collection, lock-free, real-time 



18 Impact of Java Memory Model on Out-of-Order Multiprocessors 
Tulika Mitra, Abhik Roychoudhury, Qinghua Shen 

September 2004 Proceedings of the 13th International Conference on Parallel 
Architectures and Compilation Techniques PACT '04 

Publisher: IEEE Computer Society 

Full text available: pdf(168.66 KB) Additional Information: full citation, abstract

The semantics of Java multithreading dictates all possible behaviors that a multithreaded 
Java program can exhibit on any platform. This is called the Java Memory Model (JMM) 
and describes the allowed reorderings among the memory operations in a thread. 
However, multiprocessor platforms traditionally have memory consistency models of their 
own. In this paper, we study the interaction between the JMM and the multiprocessor 
memory consistency models. In particular, memory barriers may have to be i .... 

19 Implicit coscheduling: coordinated scheduling with implicit information in distributed
systems
Andrea Carol Arpaci-Dusseau

August 2001 ACM Transactions on Computer Systems (TOCS), volume 19 issue 3 

Publisher: ACM Press 

Full text available: pdf(1.83 MB) Additional Information: full citation, abstract, references, citings, index terms

In modern distributed systems, coordinated time-sharing is required for communicating 
processes to leverage the performance of switch-based networks and low-overhead 
protocols. Coordinated time-sharing has traditionally been achieved with gang scheduling 
or explicit coscheduling, implementations of which often suffer from many deficiencies: 
multiple points of failure, high context-switch overheads, and poor interaction with
client-server, interactive, and I/O-intensive workloads. I ...

Keywords: clusters, coscheduling, gang scheduling, networks of workstations, 
proportional-share scheduling, two-phase waiting 



20 Faster high-level language virtual machines: Automatic feedback-directed object
inlining in the Java HotSpot™ virtual machine
Christian Wimmer, Hanspeter Mössenböck

June 2007 Proceedings of the 3rd international conference on Virtual execution 
environments VEE '07 

Publisher: ACM Press 

Full text available: pdf(341.49 KB) Additional Information: full citation, abstract, references, index terms

Object inlining is an optimization that embeds certain referenced objects into their
referencing object. It reduces the costs of field accesses by eliminating unnecessary field 
loads. The order of objects in the heap is changed in such a way that objects that are 
accessed together are placed next to each other in memory so that their offset is fixed, 
i.e. the objects are colocated. This allows field loads to be replaced by address arithmetic. 
We implemented this optimization for ... 

Keywords: object inlining, optimization, performance

Garbage collectors incorporating concurrent marking to cope with large live data sets and
stringent pause time constraints have become common in recent years. The
snapshot-at-the-beginning style of concurrent marking has several advantages over the incremental
update alternative, but one main disadvantage: it requires the mutator to execute a
significantly more expensive write barrier. This paper demonstrates that a large fraction
of these write barriers are unnecessary, and may be eliminated by ...

In the last three decades a large number of compiler transformations for optimizing
programs have been implemented. Most optimizations for uniprocessors reduce the 
number of instructions executed by the program using transformations based on the 
analysis of scalar quantities and data-flow techniques. In contrast, optimizations for high- 
performance superscalar, vector, and parallel processors maximize parallelism and 
memory locality with transformations that rely on tracking the properties o ... 



http://portal.acm.org/results.cfa 11/28/2007 



Results (page 1): david bacon read barrier 









Concurrency: Write barrier elision for concurrent garbage collectors
Martin T. Vechev, David F. Bacon 

October 2004 Proceedings of the 4th international symposium on Memory 
management ISMM '04 

Publisher: ACM Press 

Full text available: pdf(490.73 KB) Additional Information: full citation, abstract, references, citings, index terms

Concurrent garbage collectors require write barriers to preserve consistency, but these
barriers impose significant direct and indirect costs. While there has been a lot of work on 
optimizing write barriers, we present the first study of their elision in a concurrent 
collector. We show conditions under which write barriers are redundant, and describe how 
these conditions can be applied to both incremental update or snapshot-at-the-beginning 
barriers. We then evaluate the potential for write b ... 

Reference counting is not naturally suitable for running on multiprocessors. The update of
pointers and reference counts requires atomic and synchronized operations. We present a 
novel reference counting algorithm suitable for a multiprocessor that does not require any 
synchronized operation in its write barrier (not even a compare-and-swap type of 
synchronization). The algorithm is efficient and may compete with any tracing algorithm.

A reference-counting garbage collector cannot reclaim unreachable cyclic structures of
objects. Therefore, reference-counting collectors either use a backup tracing collector 
infrequently, or employ a cycle collector to reclaim cyclic structures. We propose a new 
concurrent cycle collector, one that runs concurrently with the program threads, imposing 
negligible pauses (of around 1ms) on a multiprocessor. 

Our new collector combines a state-of-the-art cycle collector [Bacon and R ... 

Real-time systems have reached a level of complexity beyond the scaling capability of the
low-level or restricted languages traditionally used for real-time programming. While 
Metronome garbage collection has made it practical to use Java to implement real-time 
systems, many challenges remain for the construction of complex real-time systems, 
some specific to the use of Java and others simply due to the change in scale of such 
systems. The goal of our current research is the creation of a comprehe ... 

Concurrent garbage collectors are notoriously hard to design, implement, and verify. We
present a framework for the automatic exploration of a space of concurrent mark-and- 
sweep collectors. In our framework, the designer specifies a set of "building blocks" from 
which algorithms can be constructed. These blocks reflect the designer's insights about 
the coordination between the collector and the mutator. Given a set of building blocks, 
our framework automatically explores a space of algorithms ... 

12 Syncopation: generational real-time garbage collection in the metronome

David F. Bacon, Perry Cheng, David Grove, Martin T. Vechev 

June 2005 ACM SIGPLAN Notices, Proceedings of the 2005 ACM SIGPLAN/SIGBED
conference on Languages, compilers, and tools for embedded systems 
LCTES '05, Volume 40 Issue 7 
Publisher: ACM Press 

Full text available: pdf(212.34 KB) Additional Information: full citation, abstract, references, citings, index terms

Real-time garbage collection has been shown to be feasible, but for programs with high 
allocation rates, the utilization achievable is not sufficient for some systems. Since a high 
allocation rate is often correlated with a more high-level, abstract programming style, the 
ability to provide good real-time performance for such programs will help continue to raise 
the level of abstraction at which real-time systems can be programmed. We have 
developed techniques that allow generational collection to ... 

Tracing and reference counting are uniformly viewed as being fundamentally different
approaches to garbage collection that possess very distinct performance properties. We 
have implemented high-performance collectors of both types, and in the process observed 
that the more we optimized them, the more similarly they behaved - that they seem to 
share some deep structure. 

We present a formulation of the two algorithms that shows that they are in fact duals of 
each other. Intuitively, the ... 

Commercial Java virtual machines are designed to maximize the performance of
applications at the expense of predictability. High throughput garbage collection 
algorithms, for example, can introduce pauses of 100 milliseconds or more. We are 
interested in supporting applications with response times in the tens of microseconds and 
their integration with larger timing-oblivious applications in the same Java virtual 
machine. We propose Reflexes, a new abstraction for writing highly responsive sys ... 

If the operating system could be specialized for every application, many applications
would run faster. For example, Java virtual machines (JVMs) provide their own threading 
model and memory protection, so general-purpose operating system implementations of 
these abstractions are redundant. However, traditional means of transforming existing 
systems into specialized systems are difficult to adopt because they require replacing the 
entire operating system. This paper describes Libra, an execut .... 

With concurrent and garbage collected languages like Java and C# becoming popular, the
need for a suitable non-intrusive, efficient, and concurrent multiprocessor garbage 
collector has become acute. We propose a novel mark and sweep on-the-fly algorithm 
based on the sliding views mechanism of Levanoni and Petrank. We have implemented 
our collector on the Jikes Java Virtual Machine running on a Netfinity multiprocessor and 
compared it to the concurrent algorithm and to the stop-the-world collecto ... 

Garbage-First is a server-style garbage collector, targeted for multi-processors
with large memories, that meets a soft real-time goal with high probability, while 
achieving high throughput. Whole-heap operations, such as global marking, are 
performed concurrently with mutation, to prevent interruptions proportional to heap or 
live-data size. Concurrent marking both provides collection "completeness" and identifies 
regions ripe for reclamation via compacting evacuation. This ev ... 

The emergence of standards for programming real-time systems in Java has encouraged
many developers to consider its use for systems previously only built using C, Ada, or 
assembly language. However, the RTSJ standard in isolation leaves many important 
problems unaddressed, and suffers from some serious problems in usability and safety. 

As a result, the use of Java for real-time programming has continued to be viewed as 
risky and adoption has been slow. 

In this paper we provide a ... 

We present a new algorithm for eliminating null pointer checks from programs written in
Java™. Our new algorithm is split into two phases. In the first phase, it moves null 
checks backward, and it is iterated for a few times with other optimizations to eliminate 
redundant null checks and maximize the effectiveness of other optimizations. In the 
second phase, it moves null checks forward and converts many null checks to hardware 
traps in order to minimize the execution cost of the remai ... 

readily take advantage of service threads for enhancing performance by performing tasks 
such as profile collection and analysis, dynamic optimization, and garbage collection 
concurrently with program execution. In this context, a hardware-assisted profiling 
mechanism is proposed. The Relational Profiling Architecture (RPA) is designed from the 
top down. RPA is based on a relational model similar ... 

8 Compilation Techniques for Real-Time Java Programs 
Mike Fulton, Mark Stoodley 

March 2007 Proceedings of the International Symposium on Code Generation and 
Optimization CGO '07 

Publisher: IEEE Computer Society 

Full text available: pdf(275.22 KB) Additional Information: full citation, abstract, index terms 

In this paper, we introduce the IBM® WebSphere® Real Time product, which incorporates 
a virtual machine that is fully Java™ compliant as well as compliant with the Real-Time 
Specification for Java (RTSJ). We describe IBM's real-time Java enhancements, 
particularly in the area of our Testarossa (TR) ahead-of-time (AOT) compiler, our TR just- 
in-time (JIT) compiler, and our Metronome[2] deterministic Garbage Collector (GC). The 
main focus of this paper is on the various techniques employed by ... 

Concurrent garbage collectors are notoriously hard to design, implement, and verify. We 
present a framework for the automatic exploration of a space of concurrent mark-and- 
sweep collectors. In our framework, the designer specifies a set of "building blocks" from 
which algorithms can be constructed. These blocks reflect the designer's insights about 
the coordination between the collector and the mutator. Given a set of building blocks, 
our framework automatically explores a space of algorithms ... 

We present DITTO, an automatic incrementalizer for dynamic, side-effect-free data 
structure invariant checks. Incrementalization speeds up the execution of a check by 
reusing its previous executions, checking the invariant anew only on the changed parts of the 
data structure. DITTO exploits properties specific to the domain of invariant checks to 
automate and simplify the process without restricting what mutations the program can 
perform. Our incrementalizer works for modern imperative languages ... 

Keywords: automatic, data structure invariants, dynamic optimization, 
incrementalization, optimistic memoization, program analysis 

11 Anatomy of LISP 
John Allen 
January 1978 Book 

Publisher: McGraw-Hill, Inc. 

Additional Information: full citation, abstract, references, cited by, index terms 

This text is nominally about LISP and data structures. However, in the process it covers 
much broader areas of computer science. The author has long felt that the beginning 
student of computer science has been getting a distorted and disjointed picture of the 
field. In some ways this confusion is natural; the field has been growing at such a rapid 
rate that few are prepared to be judged experts in all areas of the discipline. The current 
alternative seems to be to give a few introductory cou ... 

12 The Multics system: an examination of its structure 
Elliott I. Organick 
January 1972 Book 

Publisher: MIT Press 

Additional Information: full citation, abstract, references, cited by, index terms 

This volume provides an overview of the Multics system developed at M.I.T.— a time- 
shared, general purpose utility like system with third-generation software. The advantage 
that this new system has over its predecessors lies in its expanded capacity to manipulate 
and file information on several levels and to police and control access to data in its 
various files. On the invitation of M.I.T.'s Project MAC, Elliott Organick developed over a 
period of years an explanation of the workings, concep ... 




13 An on-the-fly reference counting garbage collector for Java 
Yossi Levanoni, Erez Petrank 

October 2001 ACM SIGPLAN Notices, Proceedings of the 16th ACM SIGPLAN 
conference on Object oriented programming, systems, languages, and 
applications OOPSLA '01, Volume 36 Issue 11 
Publisher: ACM Press 

Full text available: pdf(280.30 KB) Additional Information: full citation, abstract, references, citings, index terms 

Reference counting is not naturally suitable for running on multiprocessors. The update of 
pointers and reference counts requires atomic and synchronized operations. We present a 
novel reference counting algorithm suitable for a multiprocessor that does not require any 
synchronized operation in its write barrier (not even a compare-and-swap type of 
synchronization). The algorithm is efficient and may compete with any tracing algorithm. 

14 An on-the-fly mark and sweep garbage collector based on sliding views 
Hezi Azatchi, Yossi Levanoni, Harel Paz, Erez Petrank 

October 2003 ACM SIGPLAN Notices, Proceedings of the 18th annual ACM SIGPLAN 
conference on Object-oriented programming, systems, languages, and 
applications OOPSLA '03, Volume 38 Issue 11 
Publisher: ACM Press 

Full text available: pdf(244.12 KB) Additional Information: full citation, abstract, references, citings, index terms 

With concurrent and garbage collected languages like Java and C# becoming popular, the 
need for a suitable non-intrusive, efficient, and concurrent multiprocessor garbage 
collector has become acute. We propose a novel mark and sweep on-the-fly algorithm 
based on the sliding views mechanism of Levanoni and Petrank. We have implemented 
our collector on the Jikes Java Virtual Machine running on a Netfinity multiprocessor and 
compared it to the concurrent algorithm and to the stop-the-world collecto ... 

Keywords: concurrent garbage collection, garbage collection, memory management, 
on-the-fly garbage collection, runtime systems 

15 Guide for the use of the Ada Ravenscar Profile in high integrity systems 
Alan Burns, Brian Dobbing, Tullio Vardanega 
June 2004 ACM SIGAda Ada Letters, Volume XXIV Issue 2 
Publisher: ACM Press 

Full text available: pdf(548.17 KB) Additional Information: full citation, references 



16 Adaptive techniques: Optimistic stack allocation for java-like languages 
Erik Corry 

June 2006 Proceedings of the 5th international symposium on Memory management 
ISMM '06 

Publisher: ACM 

Full text available: pdf(155.23 KB) Additional Information: full citation, abstract, references, index terms 

Stack allocation of objects offers more efficient use of cache memories on modern 
computers, but finding objects that can be safely stack allocated is difficult, as 
interprocedural escape analysis is imprecise in the presence of virtual method dispatch 
and dynamic class loading. We present a new technique for doing optimistic stack 
allocation of objects. Our technique does not require interprocedural analysis and is 
effective in the presence of dynamic class loading, reflection and exception han ... 

Keywords: Java, garbage collection, stack allocation 



17 Compiler construction: an advanced course 

F. L. Bauer, F. L. De Remer, M. Griffiths, U. Hill, J. J. Horning, C. H. A. Koster, W. M. 
McKeeman, P. C. Poole, W. M. Waite, G. Goos, J. Hartmanis 
January 1974 Book 

Publisher: Springer-Verlag New York, Inc. 
Additional Information: full citation, abstract, references, cited by 

The Advanced Course took place from March 4 to 15, 1974 and was organized by the 
Mathematical Institute of the Technical University of Munich and the Leibniz Computing 
Center of the Bavarian Academy of Sciences, in co-operation with the European 
Communities, sponsored by the Ministry for Research and Technology of the Federal 
Republic of Germany and by the European Research Office, London. 




18 Implementation techniques: Barriers: friend or foe? 
Stephen M. Blackburn, Antony L. Hosking 

October 2004 Proceedings of the 4th international symposium on Memory 
management ISMM '04 

Publisher: ACM Press 

Full text available: pdf(137.10 KB) Additional Information: full citation, abstract, references, citings, index terms 

Modern garbage collectors rely on read and write barriers imposed on heap accesses by 
the mutator, to keep track of references between different regions of the garbage 
collected heap, and to synchronize actions of the mutator with those of the collector. It 
has been a long-standing untested assumption that barriers impose significant overhead 
to garbage-collected applications. As a result, researchers have devoted effort to 
development of optimization approaches for elimination of unnecessary ... 

Keywords: garbage collection, java, memory management, write barriers 

19 Nonblocking memory management support for dynamic-sized data structures 
Maurice Herlihy, Victor Luchangco, Paul Martin, Mark Moir 

May 2005 ACM Transactions on Computer Systems (TOCS), Volume 23 Issue 2 

Publisher: ACM Press 

Full text available: pdf(944.89 KB) Additional Information: full citation, abstract, references, citings, index terms 

Conventional dynamic memory management methods interact poorly with lock-free 
synchronization. In this article, we introduce novel techniques that allow lock-free data 
structures to allocate and free memory dynamically using any thread-safe memory 
management library. Our mechanisms are lock-free in the sense that they do not allow a 
thread to be prevented from allocating or freeing memory by the failure or delay of other 
threads. We demonstrate the utility of these techniques by showing how to m ... 

Keywords: Multiprocessors, concurrent data structures, dynamic data structures, 
memory management, nonblocking synchronization 



20 Session 1: Safe memory reclamation for dynamic lock-free objects using atomic 
Maged M. Michael 

July 2002 Proceedings of the twenty-first annual symposium on Principles of 
distributed computing PODC '02 

Publisher: ACM Press 

Full text available: pdf(1.13 MB) Additional Information: full citation, abstract, references, citings 

A major obstacle to the wide use of lock-free data structures, despite their many 
performance and reliability advantages, is the absence of a practical lock-free method for 
reclaiming the memory of dynamic nodes removed from dynamic lock-free objects for 
arbitrary reuse. The only prior lock-free memory reclamation method depends on the 
DCAS atomic primitive, which is not supported on any current processor architecture. 
Other memory management methods are blocking, require special operating system ... 